Abstract

Background: Existing software for quantitative trait mapping is either not able to model polygenic variation or 
does not allow incorporation of more than one genetic variance component. Improperly modeling the genetic 
relatedness among subjects can result in excessive false positives. We have developed an R package, QTLRel, to 
enable more flexible modeling of genetic relatedness as well as covariates and non-genetic variance components. 

Results: We have successfully used the package to analyze many datasets, including F34 body weight data from 
688 individuals genotyped at 3105 SNP markers, in which we identified 11 QTL. It took 295 seconds to estimate 
variance components and 70 seconds to perform the genome scan on a Linux machine equipped with a 
2.40GHz Intel(R) Core(TM)2 Quad CPU. 

Conclusions: QTLRel provides a toolkit for genome-wide association studies that can calculate genetic 
incidence matrices from pedigrees, estimate variance components, perform genome scans, and incorporate 
interactive covariates together with genetic and non-genetic variance components. It also supports 
multiple-QTL mapping and genome-wide tests for epistasis. 



Background 

Methods to search for quantitative trait loci (QTL) in 
common experimental designs are well established, and 
software to analyze these populations is widely available. 
One popular package, R/qtl [1], provides a comprehen- 
sive toolset for QTL mapping. Since it does not allow 
random effects, R/qtl is most suitable for mapping 
populations such as F2 and backcross where individuals 
are equally genetically related. Software that can model 
polygenic effects due to genetic relatedness includes 
TASSEL [2] and EMMA [3]. Both allow covariates as 
fixed effects but are only capable of incorporating a ran- 
dom term to account for one genetic variance compo- 
nent. However, both additive and dominance modes of 
inheritance are common for many quantitative traits. 
Ignoring these variance components may result in 
excessive false positives. Moreover, researchers may also 
be interested in interactive covariates, epistasis and 
non-genetic random effects. We have developed an R 
package QTLRel that meets all these needs. 

Implementation 

Statistical model 

Consider the following statistical model 

$$y = X\beta + Q\gamma + Zu + \epsilon \qquad (1)$$

where $y$ is a vector of phenotypes, $\beta$ is a vector of
covariate effects, $\gamma$ is a vector of putative QTL effects,
$u$ is a vector of polygenic effects and $\epsilon$ is a vector of
residual effects. $X$, $Q$ and $Z$ are incidence matrices. $\beta$
can be fixed, random or a mix of both fixed and random.
Assume that $u \sim N(0, G)$, $\epsilon \sim N(0, I\sigma^2)$, and
$u$ is independent of $\epsilon$. $G$ consists of five genetic
variance components including additive and dominance components
as well as three other components that model excess similarity
due to inbreeding [4]. The incidence matrices corresponding to
these five genetic variance components can be obtained from
condensed identity coefficients as defined in [5] (p. 133).
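
As a reading aid, the covariance structure implied by model (1) can be written out explicitly. The sketch below assumes that the covariate and QTL effects are treated as fixed and that G is linear in its five components. The symbols $K_k$ (the component-specific matrices derived from condensed identity coefficients) and $\theta_k$ (the corresponding variance or covariance parameters) are notation introduced here for illustration only and do not appear in the original text.

```latex
G = \sum_{k=1}^{5} \theta_k K_k, \qquad
\operatorname{Var}(y) = Z G Z^{\top} + I\sigma^{2}, \qquad
y \sim N\!\left(X\beta + Q\gamma,\; Z G Z^{\top} + I\sigma^{2}\right)
```

Under these assumptions, every random term other than the residual enters the phenotype distribution only through the matrix $Z G Z^{\top}$.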

Condensed identity coefficients 

While other programs are available for calculating condensed 
identity coefficients from pedigrees [6], we provide a 
function that is especially well suited to pedigrees 
with a large number of generations. Condensed identity 
coefficients can be derived from generalized kinship 
coefficients [7]. Bottom-up and top-down are two com- 
putational strategies for calculating generalized kinship 
coefficients from a pedigree. The bottom-up approach 
starts from the target individuals and moves up the ped- 
igree until reaching the founders. It requires minimal 
storage but the computational load increases approxi- 
mately exponentially with the number of generations. 
The bottom-up approach is computationally infeasible if 
both the number of generations and the number of indi- 
viduals are large. The top-down approach starts from 
founders and moves down to the target individuals. The 
computational load is approximately linear in the num- 
ber of generations. However, the intermediate general- 
ized kinship coefficients need to be stored, which may 
require extensive storage if the number of individuals in 
a generation is large. We have implemented a function 
that allows users to adjust the arguments to achieve a 
hybrid bottom-up/top-down approach, which can 
accommodate very deep pedigrees. 
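
The trade-off between the two strategies can be illustrated with a deliberately simplified stand-in: the ordinary pairwise kinship coefficient rather than the generalized kinship coefficients that QTLRel works with. The sketch below is not the package's implementation. The toy pedigree, the function names and the convention that parents always carry smaller ids than their offspring are all assumptions made for illustration.

```python
# Hypothetical toy pedigree: id -> (father, mother); (None, None) marks a founder.
# Assumption: parents always have smaller ids than their offspring.
PEDIGREE = {
    1: (None, None),
    2: (None, None),
    3: (1, 2),
    4: (1, 2),
    5: (3, 4),
}


def kinship_bottom_up(ped, i, j):
    """Start at the target pair and recurse up the pedigree to the founders.

    Nothing is stored, but the same ancestral pairs are revisited many times,
    so the work grows quickly with the number of generations.
    """
    if i is None or j is None:
        return 0.0
    if i == j:
        father, mother = ped[i]
        return 0.5 * (1.0 + kinship_bottom_up(ped, father, mother))
    if i < j:
        i, j = j, i  # always expand the younger individual (larger id)
    father, mother = ped[i]
    return 0.5 * (kinship_bottom_up(ped, father, j)
                  + kinship_bottom_up(ped, mother, j))


def kinship_top_down(ped):
    """Fill a table from the founders downwards, storing every pair once.

    Each pair is computed a single time, at the cost of keeping the whole
    table of intermediate coefficients in memory.
    """
    phi = {}

    def get(a, b):
        if a is None or b is None:
            return 0.0
        return phi[(a, b)] if a >= b else phi[(b, a)]

    for i in sorted(ped):
        father, mother = ped[i]
        for j in sorted(ped):
            if j >= i:
                break
            phi[(i, j)] = 0.5 * (get(father, j) + get(mother, j))
        phi[(i, i)] = 0.5 * (1.0 + get(father, mother))
    return phi
```

A hybrid scheme in this spirit would tabulate coefficients top-down for selected generations only and recurse bottom-up between them, trading storage against recomputation.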

Variance components 

QTLRel can estimate variance components given the 
appropriate incidence matrices. QTLRel estimates these 
variance components using maximum likelihood. These 
estimates are nearly equivalent to those obtained by 
restricted maximum likelihood for typical sample sizes. 
The maximum likelihood estimates are found numerically 
using one of several optimization methods. We default to 
Nelder-Mead because we have found it to be more numerically 
stable than the alternatives. QTLRel allows users to select 
variance components using a model selection procedure or to 
perform statistical significance tests for them. 
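
For orientation, the objective maximized in such a fit can be written as the standard Gaussian log-likelihood. The form below is a sketch under stated assumptions, not a transcription of the package's code: it assumes a model with fixed covariate effects and no QTL term, and it uses $\theta$ to denote the collected variance parameters and $V(\theta)$ the resulting phenotypic covariance matrix, both of which are notation introduced here.

```latex
\ell(\beta, \theta) = -\frac{1}{2}\left[\, n \log(2\pi) + \log\left|V(\theta)\right|
  + (y - X\beta)^{\top} V(\theta)^{-1} (y - X\beta) \,\right],
\qquad V(\theta) = Z\,G(\theta)\,Z^{\top} + I\sigma^{2}
```

Here $n$ is the number of phenotyped individuals. A derivative-free optimizer such as Nelder-Mead needs only evaluations of $\ell$, not its gradient.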

Genome scans 

Re-estimating variance components at each marker in a 
genome scan may not be computationally feasible. The 
approach used by QTLRel is to first estimate the correla- 
tion matrix due to polygenic, residual and other random 
effects, which is based on the estimated variance compo- 
nents, and then use this matrix as known to scan the gen- 
ome. Testing fixed effects conditional on estimated 
random effects is a general approach in mixed-effect models 
[2,3,8,9]. Our method is most similar to the measured-genotype 
fixed-heritability method in Aulchenko et al. [9]. 
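
The two-step idea can be made concrete with generalized least squares. In the sketch below, $\hat{V}$ denotes the covariance matrix of all random terms evaluated at the estimated variance components and then held fixed, and $W = [X, Q]$ stacks the covariate and marker design columns; both symbols are introduced here for illustration. The text does not state which test statistic the package reports, so the weighted residual-sum-of-squares difference shown is only one standard choice.

```latex
\begin{aligned}
(\hat{\beta}, \hat{\gamma}) &= \left(W^{\top} \hat{V}^{-1} W\right)^{-1} W^{\top} \hat{V}^{-1} y, \\
\tilde{\beta} &= \left(X^{\top} \hat{V}^{-1} X\right)^{-1} X^{\top} \hat{V}^{-1} y, \\
T &= (y - X\tilde{\beta})^{\top} \hat{V}^{-1} (y - X\tilde{\beta})
   - (y - X\hat{\beta} - Q\hat{\gamma})^{\top} \hat{V}^{-1} (y - X\hat{\beta} - Q\hat{\gamma})
\end{aligned}
```

If $\hat{V}$ is treated as known, $T$ is the likelihood ratio statistic for $\gamma = 0$. Only $Q$ changes from marker to marker, so $\hat{V}^{-1}$ (or a factorization of it) can be computed once and reused across the whole scan.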

Empirical significance thresholds 

QTLRel implements two methods for estimating genome-wide 
significance thresholds. The first is a permutation test in which 
the genotypes are permuted while the phenotypes and incidence 
matrices are held constant. We have previously demonstrated that 
type I error rates are inflated under a permutation test when 
polygenic effects are ignored in the model; however, when the 
model is appropriate, permutation performs well [10]. The second 
method is gene dropping, which can appropriately control type I 
error rates even when polygenic variation is ignored in the model. 
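
Both procedures share the same outline: generate genotypes under the null hypothesis, rescan the genome, record the genome-wide maximum statistic, and take an upper quantile of those maxima as the threshold. The sketch below illustrates that outline and is not QTLRel's code. All function names are hypothetical, the scan statistic is supplied by the caller, and the gene-dropping routine handles a single locus only, so it ignores linkage between markers.

```python
import math
import random


def gene_drop(pedigree, founder_genotypes, rng):
    """Drop alleles at one locus down a pedigree by Mendelian sampling.

    pedigree: dict id -> (father, mother), listed parents before offspring;
              founders are marked (None, None).
    founder_genotypes: dict founder id -> (allele, allele).
    Returns dict id -> (paternally inherited allele, maternally inherited allele).
    """
    geno = dict(founder_genotypes)
    for ind, (father, mother) in pedigree.items():
        if father is None and mother is None:
            continue  # founder genotype is supplied by the caller
        geno[ind] = (rng.choice(geno[father]), rng.choice(geno[mother]))
    return geno


def permutation_scans(genotypes, phenotypes, statistic, n_perm, rng):
    """Null scans obtained by shuffling genotype rows among individuals.

    genotypes: one list of marker genotypes per individual.
    statistic(marker_column, phenotypes) -> float, larger means stronger evidence.
    Phenotypes are left in place, as in the permutation test described above.
    """
    n_markers = len(genotypes[0])
    scans = []
    for _ in range(n_perm):
        shuffled = genotypes[:]
        rng.shuffle(shuffled)
        scans.append([statistic([row[m] for row in shuffled], phenotypes)
                      for m in range(n_markers)])
    return scans


def empirical_threshold(null_scans, alpha=0.05):
    """Return the (1 - alpha) quantile of the per-scan maximum statistic."""
    maxima = sorted(max(scan) for scan in null_scans)
    k = math.ceil((1.0 - alpha) * len(maxima)) - 1
    return maxima[max(0, min(k, len(maxima) - 1))]
```

The key difference is what each null sample preserves. Permutation keeps the observed genotypes but breaks their link to the pedigree, whereas gene dropping regenerates genotypes that respect the pedigree. A realistic gene-dropping run would also simulate recombination along each chromosome.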

Results 

QTLRel has been successfully used in an advanced intercross line (AIL) to identify 
QTL for methamphetamine sensitivity [10], muscle 
weight [11], prepulse inhibition [12] and body weight 
[13]. In the analysis of body weight we calculated all five 
genetic incidence matrices, but only estimated additive 
and dominance variance components because the other 
variance components are negligible in general [14]. We 
then performed a genome scan using 688 individuals 
genotyped at 3,105 SNP markers. We identified 11 QTL 
that exceeded the 0.05 genome-wide significance threshold 
estimated from 1800 gene dropping samples. These 
11 QTL were confirmed as distinct signals using a for- 
ward step-wise multiple-QTL mapping function imple- 
mented in QTLRel. We also investigated genome-wide 
epistatic effects but found none. The analysis was 
accomplished on a Linux machine equipped with a 
2.40GHz Intel(R) Core(TM)2 Quad CPU. It took 295 
seconds to estimate variance components and 70 sec- 
onds to perform the genome scan. 

Conclusions 

QTLRel provides a toolkit for genome-wide association 
studies that is capable of calculating genetic incidence 
matrices from pedigrees, estimating variance compo- 
nents, performing genome scans, and estimating signifi- 
cance thresholds. It can model interactive covariates and 
multiple genetic and non-genetic variance components. 
Other functions include multiple-QTL mapping and 
genome-wide epistasis. QTLRel can perform interval 
mapping based on the Haley-Knott method [15] for 
markers with two alleles. Because QTLRel is implemented 
in R, users can take advantage of numerous other statistical 
packages; however, there is room to improve on 
QTLRel's speed since it makes use of many intermediate 
R functions. It is our intention to extend some function- 
alities, e.g., the Haley-Knott method to markers with 
more than two alleles. In summary, QTLRel provides a 
stand-alone, comprehensive tool to perform QTL ana- 
lyses in populations in which relatedness is a concern. 

Availability 

QTLRel is an R package. It is publicly available from 
CRAN (http://cran.r-project.org/web/packages/QTLRel) 
for Windows, Linux and Mac machines under the GNU 
GPL license. A tutorial is available in the package as 
well as on the Palmer lab web page 
http://www.palmerlab.org/software. 